Provide CAII batch embedding for better performance #35
Merged
I think we should get this in, even if we haven't figured out how to test it yet. We can test it in-situ at least for starters.
jkwatson approved these changes Dec 5, 2024
ewilliams-cloudera added a commit that referenced this pull request Dec 12, 2024
* upgrade everything * small refactor for params, update loading * add bedrock converse * fix loading * Clean up Cohere suggested questions * Add property-based test for process_response() (#56) * Add hypothesis * Add property-based test for process_response() * Shorten variable * Formatting * Add type annotations * Fix type annotation * hacking on startup scripts * hacking on startup scripts, moar * fix wrong dir * try having the java side restart itself if it dies * see output from java startup * add debug info * add the executable bit * change the flags * Add docstrings for tests * refactor datasourceId * update to exclude 405b model and default to 8b * update readme for new cohere * fix broken tests monkeypatching * "wip on creating with models and response chunks" * wip on modal updates * commit java updates * wip on populating the chat setting modal * set up ui for updating a session * add update method * use updated session for chat * remove query configuration from the chat context * refactoring fe and fixing bug with empty model * remove the datasource id from the context and use the active session instead * Update release version to 1.4.0-beta * Support multiple embedding models (#59) * add embedding model to the data source in the java API * embedding model used from the datasource while indexing * replace the rest of the embedding model defaults * "test & fix bugs with embedding variability" * small refactoring to make embedding & llm caii methods look the same * fix linting issues * add a todo for a failing property test case * remove unused import --------- Co-authored-by: Elijah Williams <ewilliams@cloudera.com> * Provide CAII batch embedding for better performance (#35) * CAII endpoint discovery (#60) * "wip on endpoint listing" * "wip on list_endpoints typing" * "refactoring to endpoint object" * "wip filtering" * "endpoints queried!" 
* "refactoring" * "wip on cleaning up types" * "type cleanup complete" * "moving files" * "use a dummy embedding model for deletes" * fix some bits from merge, get evals working again with CAII, tests passing * formatting * clean up ruff stuff * use the chat llm for evals * fix mypy for reformatting * "wip on java reconciler" * "reconciler don't do no model; start python work" * "python - updating for summarization model" * "comment out batch embeddings to get it working again" * add handling for no summarization in the files table * finish up ui and python for summarization * make sure to update the time-updated fields on data sources and chat sessions * use no-op models when we don't need real ones for summary functionality * Update release version to dev-testing * use the summarization llm when summarizing summaries --------- Co-authored-by: Elijah Williams <ewilliams@cloudera.com> Co-authored-by: actions-user <actions@github.com> * Update release version to 1.4.0 * pass the original filename from java-> python so we don't need s3 metadata to store it * don't read the whole directory when summarizing docs * "refactor java to use RagFileService" * remove seaweedfs experiment * Make mypy happy (#62) * Refactor summary index to isolate the logic (#63) * Refactor summary index to isolate the logic * fix tests * handle race condition * handle mypy * ignore errors if the directory doesn't exist --------- Co-authored-by: jwatson <jkwatson@gmail.com> * image * Update catalog entry to match the official one (#66) * Update local catalog with official info * add the git-ref back * add the html long description (#67) * Shuffle API for data sources for easier human consumption (#68) * Shuffle API for data sources for easier human consumption * make mypy happy * remove prints * wip o fs rag file uploader * "now we're thinking with overtime" * Revert ""now we're thinking with overtime"" This reverts commit 3c93206. 
* get the databases directory from the environment (in local_dev) python file storage abstraction python tests currently broken real AMP startup script needs new env var * add a todo * merge from main * properly override the configuration in pytest configure to point at a temp directory * get the tests passing with filesystem file handoff * update project metadata to support new local filesystem storage * Update release version to dev-testing * fix java * cleanup after switching tests to use the local filesystem * Remove unused settings (#70) * remove unused dep * fix circular dep and refactor doc storage * Update release version to 1.4.0 * Summarize the data store on every document summarization (#69) * fix bug with s3 path when the prefix is not provided (#72) * add --reload to the fastapi startup_app script * Avoid global variables and use ephemeral folder for tests (#71) * Avoid global variables and use ephemeral folder for tests * fix with merge to main * Remove print * lint * refetch knowledge base summary on doc summary change * Bump @eslint/plugin-kit (#16) Bumps the npm_and_yarn group with 1 update in the /ui directory: [@eslint/plugin-kit](https://github.com/eslint/rewrite). Updates `@eslint/plugin-kit` from 0.2.2 to 0.2.3 - [Release notes](https://github.com/eslint/rewrite/releases) - [Changelog](https://github.com/eslint/rewrite/blob/main/release-please-config.json) - [Commits](eslint/rewrite@plugin-kit-v0.2.2...plugin-kit-v0.2.3) --- updated-dependencies: - dependency-name: "@eslint/plugin-kit" dependency-type: indirect dependency-group: npm_and_yarn ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: jwatson <jkwatson@gmail.com> Co-authored-by: Michael Liu <mliu@cloudera.com> Co-authored-by: actions-user <actions@github.com> Co-authored-by: conradocloudera <csilvamiranda@cloudera.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Currently in draft, as I'm not sure how to test this.
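On the testing question, one possible approach is to exercise the batching logic against a stub instead of a live CAII endpoint. The sketch below is only an illustration and is not taken from this PR's diff: `batched`, `embed_in_batches`, `FakeEndpoint`, and the `batch_size` parameter are all hypothetical names, and the real CAII client call would take the place of the `embed_batch` callable.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 100,
) -> List[List[float]]:
    """Embed `texts` with one call per batch, preserving input order.

    `embed_batch` is a stand-in (hypothetical) for whatever function sends
    a list of texts to the embedding endpoint in a single request.
    """
    embeddings: List[List[float]] = []
    for batch in batched(texts, batch_size):
        vectors = embed_batch(list(batch))
        if len(vectors) != len(batch):
            raise ValueError("endpoint returned a different number of vectors")
        embeddings.extend(vectors)
    return embeddings


class FakeEndpoint:
    """Test double that records the size of each batch it receives."""

    def __init__(self) -> None:
        self.calls: List[int] = []

    def __call__(self, batch: List[str]) -> List[List[float]]:
        self.calls.append(len(batch))
        # Deterministic fake vector: the length of each text.
        return [[float(len(text))] for text in batch]


if __name__ == "__main__":
    fake = FakeEndpoint()
    vectors = embed_in_batches(["a", "bb", "ccc"], fake, batch_size=2)
    print(vectors, fake.calls)
```

A test can then assert on both the returned vectors and the recorded batch sizes, which checks that order is preserved and that the number of requests drops as expected. This would not replace the in-situ check against a real endpoint.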